perm filename SYS4[TLK,DBL]8 blob sn#185645 filedate 1975-11-06 generic text, type C, neo UTF8
.DEVICE XGP

.FONT 1 "BASL30"
.FONT 2 "BASB30"
.FONT 4  "BASI30"
.FONT 5  "BDR40"
.FONT 6  "NGR25"
.FONT 7  "NGR20"
.FONT 8  "GRFX35"
.FONT A "SUP"
.FONT B "SUB"
.TURN ON "↑α↓_π[]{"
.TURN ON "⊗" FOR "%"
.TURN ON "@" FOR "%"
.PAGE FRAME 54 HIGH 89 WIDE
.TITLE AREA HEADING LINES 1 TO 2
.AREA TEXT LINES 3 TO 53
.COUNT PAGE PRINTING "1"
.TABBREAK
.ODDLEFTBORDER←EVENLEFTBORDER←850
.AT "ffi" ⊂ IF THISFONT ≤ 4 THEN "≠"  ELSE "fαfαi" ⊃;
.AT "ffl" ⊂ IF THISFONT ≤ 4 THEN "α∞" ELSE "fαfαl" ⊃;
.AT "ff"  ⊂ IF THISFONT ≤ 4 THEN "≥"  ELSE "fαf" ⊃;
.AT "fi"  ⊂ IF THISFONT ≤ 4 THEN "α≡" ELSE "fαi" ⊃;
.AT "fl"  ⊂ IF THISFONT ≤ 4 THEN "∨"  ELSE "fαl" ⊃;
.SELECT 1
.MACRO B ⊂ BEGIN VERBATIM GROUP ⊃
.MACRO E ⊂ APART END ⊃
.MACRO D ⊂ ONCE PREFACE 100 MILLS ⊃
.MACRO BB ⊂ BEGIN  FILL  ADJUST SINGLE SPACE PREFACE 1 INDENT 0,4,0 ⊃
.MACRO BGIV ⊂ BEGIN NOFILL SELECT 7 INDENT 0 PREFACE 0  TURN OFF "{}" TURN ON "↑↓" ⊃
.MACRO B0 ⊂ BEGIN  WIDEN 2,7 SELECT 8 NOFILL PREFACE 0 MILLS TURN OFF "↑↓α"  GROUP ⊃
.MACRO B7 ⊂ BEGIN  WIDEN 7,7 SELECT 8 NOFILL PREFACE 0 MILLS TURN OFF "↑↓α"  GROUP ⊃
.MACRO W(F) ⊂ SELECT F NOFILL SINGLE SPACE; PREFACE 0; WIDEN 7,7 ⊃
.MACRO FAD ⊂ FILL ADJUST DOUBLE SPACE PREFACE 2 ⊃
.MACRO FAS ⊂ FILL ADJUST SINGLE SPACE PREFACE 1 ⊃
.FAD

.MYFOOT←1
.FOOTSEP←"________________________________________________________________________________________"
.COUNT FOOTNOTE INLINE PRINTING "⊗A1⊗*"
.AT "$$" ENTRY "*" ⊂ XGENLINES←XGENLINES-1; NEXT FOOTNOTE; !;
.SEND FOOT ⊂ TURN ON "[]{" SELECT 7; SPACING 0; PREFACE 0; INDENT 0,10
⊗A{MYFOOT}⊗* ENTRY
.MYFOOT←MYFOOT+1
.BREAK ⊃ ⊃

.MACRO NSECP(A)  ⊂   SSECNUM←0
.SECNUM←SECNUM+1 
.SECTION←"A"
.SKIP TO COLUMN 1
.TURN ON "{∞→"   
.SEND CONTENTS ⊂
@1{SECNUM}. A⊗* ∞.→ {PAGE}
.⊃
.TURN OFF "{∞→"   
.ONCE CENTER TURN ON "{}"
@5↓_{SECNUM}. A_↓⊗*  
.  ⊃


.MACRO ASEC(A)  ⊂  SSECNUM←0
.SECTION←"A"
.SKIP TO COLUMN 1
.TURN ON "{∞→"   
.SEND CONTENTS ⊂
@1 A⊗*
.⊃
.TURN OFF "{∞→"   
.ONCE CENTER TURN ON "{}"
@5↓_A_↓⊗*  
.  ⊃


.MACRO NSEC(A)  ⊂  
.SSECNUM←0
.SECNUM←SECNUM+1 
.TURN ON "{∞→"   
.SEND CONTENTS ⊂
@1{SECNUM}. A⊗* ∞.→ {PAGE}
.⊃
.TURN OFF "{∞→"   
.SECTION←"A"
.GROUP SKIP 3
.ONCE CENTER TURN ON "{}"
@5↓_{SECNUM}. A_↓⊗*  
.  ⊃


.MACRO SSEC(A)  ⊂  TURN ON "{∞→"   
.SSECNUM←SSECNUM+1
.SSSECNUM←0
.SEND CONTENTS ⊂
@7               A⊗* ∞.→ {PAGE}
.⊃
.TURN OFF "{∞→"   
.ONCE INDENT 6 TURN ON "{}"
@2↓_{SECNUM}.{SSECNUM}. A_↓⊗*  
. ⊃

.MACRO ASSEC(A)  ⊂  TURN ON "{∞→"   
.SEND CONTENTS ⊂
@7               A⊗*
.⊃
.TURN OFF "{∞→"   
.ONCE INDENT 6 TURN ON "{}"
@2↓_A_↓⊗*  
. ⊃

.MACRO SSSEC(A)  ⊂  TURN ON "{∞→"   
.SSSECNUM←SSSECNUM+1
.TURN OFF "{∞→"   
.ONCE INDENT 1 TURN ON "{}"
@2↓_{SECNUM}.{SSECNUM}.{SSSECNUM}. A_↓⊗*  
. ⊃

.SECNUM←0
.SELECT 1
.INSERT CONTENTS
.PAGE←0
.PORTION THESIS
.NARROW 2,7
.TURN OFF "{∞→}"   
.PAGE←0
.NEXT PAGE
.INDENT 0
.FAD
.TURN OFF "{∞→}"   
.ONCE CENTER
.NSEC(Objective)

.EVERY HEADING(⊗7Automated Math Theory Formation,Doug Lenat,{DATE}     page  {PAGE}⊗*)

Throughout all of science, one of the most important issues is that of
⊗4↓_theory formation:_↓⊗*  concept acquisition,
how to recognize potentially related concepts, how to tie such concepts
together productively,
how to use intuition, how to choose, when to give up and try another
approach,
how to extend, when to define, what to examine, what to ignore,...
These questions are difficult to ⊗4ask⊗* precisely, even in a
single domain. 

For my dissertation, I am investigating
creative theory formation in mathematics: 
how math researchers propose interesting yet plausible
hypotheses, test them, and develop them into interesting theories.$$A mathematical
theory encompasses (i) a basis of primitive objects and activities, (ii) a
foundation of axiomatic constraints on the basis, (iii) a body of definitions
tying together basic and already-defined concepts, (iv) a body of theorems which
are implied by the foundation and the definitions, (v) some interpretations in
terms of either reality or  other theories.*

The experimental ⊗4vehicle⊗* of my research is
a system, a computer program called AM
(for ⊗2↓_A_↓⊗*utomated ⊗2↓_M_↓⊗*athematician),
which can actually do simple mathematical research:
propose new definitions$$ By combining or modifying existing concepts* and 
axiomatizations,$$ Based on AM's earlier successes and/or on simulated
real-world situations *  
propose and (less importantly) prove conjectures,$$ Using the
heuristics discussed by Polya (e.g., analogy, inductive inference from empirical
evidence, planning in an abstract space, double induction) *
use intuition$$ Intuition
will mean 
simulated real-world scenarios. These are predominantly visual analogies
of current system problems. From them, the system can extract "answers"
which, while not justifiable, are quite often correct. *
to justify beliefs and to
understand new observations, and evaluate concepts and theories for aesthetic$$
Aesthetics has components of harmony, unity, elegance, simplicity, closure, etc.*
interestingness$$ Interestingness will be evaluated locally; e.g.,
the concept named "COMPOSITION" knows several features which indicate when any given
composition of relations 
is interesting; one of these is "if the domain and range of the resultant
composition are equal sets,
and such domain/range equality does ↓_not_↓ hold true
for either of the two constituent relations of
the composition".*.
AM's activities all serve to expand AM itself, to enlarge upon a given body
of mathematical knowledge; due to the enormity of the "search space" involved,
AM must use judgmental
criteria
to guide development in the most promising direction.
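As an illustration, the COMPOSITION feature quoted in the footnote above can be stated as a small predicate. The sketch below is in modern Python rather than AM's Interlisp, and representing a finite relation as a (domain, range, pairs) triple is an assumption of the sketch, not AM's actual format:

```python
# Sketch of one interestingness feature of COMPOSITION: the domain and
# range of the resultant composition are equal sets, though neither
# constituent relation's are.
def composes(f, g):
    """The composition f o g of two finite relations."""
    fdom, fran, fpairs = f
    gdom, gran, gpairs = g
    pairs = {(a, c) for (a, b) in gpairs for (b2, c) in fpairs if b == b2}
    return (gdom, fran, pairs)          # domain of g, range of f

def interesting_composition(f, g):
    """Domain and range of f o g coincide, though neither f's nor g's do."""
    fdom, fran, _ = f
    gdom, gran, _ = g
    return gdom == fran and fdom != fran and gdom != gran
```

With f and g a pair of mutually inverse maps between two unequal sets, the composition maps a set back onto itself and the feature fires; composing f with itself does not.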

More specifically, AM will be given 150 particular ⊗4premathematical⊗*$$
Universal, fundamental, prenumerical knowledge. This includes:
(i) mathematics: elementary notions of sets, relations, properties, 
(ii) methods: problems, problem-solving,
proof techniques, analogy, programming, strategies, communication,
(iii) opaque abilities:  criteria
to evaluate interestingness, aesthetics, utility, etc; locate relevant knowledge,
manipulate abstract simulated "intuition" scenarios. *
concepts,$$
What do we mean by a "mathematical concept"?  A few moments of pondering this
should convince the reader that this cannot be answered cleanly.
A circular definition of a mathematical concept might be
"all the things discussed in math books";
an operational definition might be "whatever AM knows about initially, plus
whatever new knowledge it acquires." 
Later, we shall indicate the classes of concepts involved; for now, let us just
mention a few specific ones to indicate the breadth involved. Each of the
following is a mathematical concept:
Set, Relation, α{1, frob, α{α}α}, Zorn's Lemma, Theory, Union, Prove, Proof, Theorem,
The Unique Factorization Theorem (UFT), The proof of UFT, The methods used to
prove UFT, Constructively proving existence, Associativity.*
and will 
then study their facets, occasionally deciding that some facet is worth
naming and storing as a new concept. 

There is no specific target or goal that AM should achieve$$ Prior
research [Lenat, IJCAI75] has convinced me of the danger in choosing a
too-well-defined goal.
Even subconsciously, one "builds in" too much  task-specific knowledge,
instead of more general domain-specific abilities.*.
The basic criteria for evaluating AM will be the traditional harsh standards
used to judge any mathematical work: are the new mini-theories interesting,
aesthetic, useful,...? 

To indicate the level of sophistication -- and naivete -- expected of
AM, a sketch of
one possible line of development will be presented.
Some guidance by a human user might be required to push AM in such a
"traditional" direction.
AM soon develops the concepts of singleton, function, inverse relation, 
and set-equivalence (Cardinality).
These pave  the way to the domain
of (elementary) number theory.
Eventually, AM  develops enough mathematical systems to
notice a common structure, which is abstracted into something like the
group axioms. AM then begins exploring elementary abstract algebra.


AM is almost completely specified on paper$$
Each of the 150 concepts to be initially supplied has about one page of
information; all of this is found in the file GIVEN[TLK,DBL] at SAIL.*.
The control structure for the system is programmed, and many concepts have
already been introduced$$ The
latest Interlisp program is stored on the <LENAT> directory at SUMEX; it
is a group of 2 files: TOP5 (the control functions) and CON5 (the actual
concepts). To run  it, simply enter Interlisp and load  those two files.*.
Some nontrivial behavior already evinced (November, 1975) includes
(1) the concept of "a set, all pairs of whose elements satisfy a given 
predicate"; (2) the concept of a composition of relations whose initial domain
and final range intersect each other; (3) the predicate "sets of equal length"
-- i.e., Cardinality. Of course, many uninteresting new concepts have developed as well,
and the results are still too primitive to support any conclusions.
Hopefully, some definite results will be obtained this Winter.
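Behavior (1) above, for example, reduces to a one-line test. A modern Python sketch (AM's version is an Interlisp concept module, not a bare function like this):

```python
# Behavior (1): "a set, all pairs of whose elements satisfy a given
# predicate" -- sketched as a test on a finite set s and predicate p.
def all_pairs_satisfy(s, p):
    return all(p(a, b) for a in s for b in s if a != b)

# e.g. with p = "differ by at least 2": {1, 4, 7} qualifies, {1, 2} fails
```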

Such a project immediately raises many issues; they are listed below, and
each one is then dealt with in a separate section.

.BEGIN NOFILL INDENT 3 PREFACE 0 SELECT 2 GROUP; ONCE PREFACE 2

Why choose mathematics research as the domain for investigating theory formation?
What are some potential concrete applications of this project?
How will the success of the system be measured?
What experiments can be done on AM?
What is our model of math research? ⊗7To automate something, you must have a good model for it.⊗*
What given knowledge will AM initially start with?
How will this knowledge be represented?
What is the control mechanism; what does AM "do"?
Examples of individual knowledge modules. ⊗7One module for each concept.⊗*
Examples of communities of modules interacting, developing new modules.
What is the timetable for this project?   What is the current state?
.END

.EVERY HEADING(⊗7Automated Math Theory Formation,Doug Lenat,{DATE}     page  {PAGE}⊗*)

.NSECP(Why Choose Math Research?)

As the DENDRAL project illustrated so clearly$$ see Buchanan,
Feigenbaum, et al.  Choice of subject was enabled by J. Lederberg in 1965.*,
choice of subject domain is
quite important when studying how researchers discover and develop their theories.
Mathematics has been chosen as the domain of this investigation, because

(i) In doing math research, one needn't cope with
the uncertainties and fallibility of testing equipment;
that is, there are no uncertainties in the data (compared to, e.g., chemical
identification from mass spectrograms).

(ii) Reliance on experts' introspections is one of the most powerful techniques
for codifying the judgmental criteria necessary to do effective work in a field;
I am enough  of an expert in 
elementary mathematics so that I
won't have to rely on external
sources for guidance in formulating such heuristic rules.

(iii) A hard science is of course easier to work in than a soft one; to
automate research in psychology would require more knowledge about 
human information processing than now is known, because psychology deals with
entities as complex as you and I. Also, in a hard science, the ⊗4languages⊗* to
communicate information can be simple even though the 
⊗4messages⊗* themselves are sophisticated.

(iv) 
Since mathematics can deal with any conceivable constructs, a researcher there is
not
limited to  explaining observed data.
This reason is much less influential in practice than would be expected:
It turns out that most unmotivated forays are fruitless.
Most of the successful, interesting mathematical theories are inspired by reality$$
Or by other math theories which are well-established already. It is too philosophical
an issue to argue whether or not such theories constitute part of reality, so I
merely note them as a possible exception.*.
.NSECP(Potential Applications)

(i) The modular representation of knowledge that AM uses might prove to be effective
in other large knowledge-based systems.

(ii) Most of the particular heuristic judgmental criteria for interestingness,
utility, etc., might be valid in developing theories in other sciences.

(iii) If the repertoire of intuition (simulated real-world scenarios)
is sufficient for AM to develop elementary concepts of math, then educators
should ensure that children (4-6 years old) are thoroughly exposed to those
scenarios.$$ These tentatively include situations like seesaws, slides,
piling marbles into pans of a balance scale, comparing the heights of towers
built out of cubical blocks, solving a jigsaw puzzle, etc.*

(iv) We might learn something about how the theory formation task changes
as the theory grows in sophistication. For example, can the same methods which
lead AM from premathematical concepts to arithmetic also lead AM from
there to abstract algebra? Or are a new set of intuitions or criteria required?

(v) An unanticipated but possible result would be the proofs of -- or at least the
statements of -- interesting new theorems or even whole theories.
This might also take the form of a redivision of existing concepts, an alternate
formulation of some established theory, etc.
.NSECP(Measuring Success)

(i) By AM's achievements: Compare the list of concepts and methods it develops
against the list of concepts and methods it began with.
Did AM ever discover anything interesting yet unknown to the user?$$ 
The "user" is a human who works with AM interactively, giving it hints, commands,
questions, etc.
Notice that by "new" we mean new to the user, not new to Mankind. 
This might occur if the user were a child, and AM discovered
some elementary facts of arithmetic.
This is not really
so provincial:  mathematicians take "new" to mean new to Mankind, not
new in the Universe.  I feel philosophy slipping in, so this footnote is
terminated.*

(ii) By the route AM took to accomplish these advances:  How clever, how circuitous,
how many of the detours were quickly identified as such and abandoned?

(iii) By the character of the User--System interactions: How important is the user's
guidance? How closely must he guide AM? What happens if he doesn't say anything ever?
When he does want to say something, is there an easy way to express that to AM,
and does AM respond well to it?
Given a reasonable kick in the right direction, can AM develop the mini-theories
which the user intended, or at least something equally interesting?

(iv) By its intuitive powers: when forced to make a snap judgment on some issue,
which side does AM choose, and why?  Are the conjectures it tries to prove usually
true?$$ In fact, I hope AM tries to prove an intuitively clear yet false
proposition.*  How accurately does AM estimate the difficulty of tasks it
is considering?  How big a help is the intuitive belief in a conjecture
when trying to formulate a constructive proof of that conjecture?
Does AM tie together (e.g., as analogous) concepts which are formally unrelated
yet which benefit from such a tie?

(v) By the ability to perform the experiments outlined in the next section.
Regardless of the experiments' outcomes, 
the features of AM which allow them to be carried
out at all would be interesting in themselves.
.NSECP(Experimenting with AM)

Assume that AM has been written, and has in fact developed some new$$
New to AM, not to the world. It would be quite accidental if AM proved 
theorems unknown to Mankind.*
concepts on its own. 
Here is a list of some of the more interesting experiments that would be
possible to perform, using AM:

(i) Remove individual concept modules. Is performance noticeably degraded?$$
AM should operate even if most of its criteria have been
"turned off"
and most of its modules excised,
so long as ↓_any_↓ parts of any of the modules remain
enabled. 
If the remaining fragment of AM is too small, however, it may not be
able to find anything interesting to do. In fact, this situation has
been encountered experimentally, when the first few partially complete
modules were inserted. If only some of the abilities and criteria are turned off,
AM may in fact keep operating without this "uninteresting collapse".
For example, if all but
the formal manipulation knowledge is removed, the
system should still grind out
(simple) proofs. If all but the analogy and intuition criteria are excised,
some plausible (but uncertain) conjectures should still be produced and
built upon. 
If these forces are buried deep in an Environment, they should be tunable
(by the creators) to almost negligibility, so the same experiments can still be
carried out.
The converse situation should also hold: although still functional with any module
unplugged, the performance ↓_should_↓ be noticeably degraded. 
That is, while not indispensable, each module should nontrivially help the others.
For example,
the job of proving an assertion
should be made much easier by the presence of intuitive understanding. If
a constructive proof is available, the necessary materials will already be
sketched out for the formal methods to build upon. *
Which concepts does AM now "miss" discovering? Is the removed concept
later discovered anyway by those which are left in AM?  This should indicate
the importance of each kind of concept (and method) which AM starts with.

(ii) Vary the relative weights given to features by the criteria which judge
aesthetics, interestingness, worth, utility, etc.  See how important each factor
is in directing AM along successful routes.
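Experiment (ii) can be pictured as perturbing the weights of a linear worth function and watching the agenda reorder. The feature names and all numbers below are illustrative inventions of this sketch, not AM's:

```python
# Concepts scored by a weighted sum of judgmental features; changing
# the relative weights can reorder AM's agenda of things to pursue.
def worth(features, weights):
    return sum(weights[k] * features.get(k, 0) for k in weights)

concepts = {
    "Cardinality": {"interest": 0.9, "aesthetics": 0.7, "utility": 0.80},
    "TrivialComp": {"interest": 0.2, "aesthetics": 0.3, "utility": 0.95},
}

def rank(weights):
    return sorted(concepts, key=lambda c: -worth(concepts[c], weights))

even    = {"interest": 1.0, "aesthetics": 1.0, "utility": 1.0}
utility = {"interest": 0.0, "aesthetics": 0.0, "utility": 1.0}
```

Under equal weights the first concept leads; under a utility-only weighting the ordering flips, which is exactly the kind of effect the experiment would measure.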

(iii) Add several new concept modules (and a few new intuitions) and see if AM
can work in some unanticipated field of mathematics (like 
graph theory or calculus).

(iv) Add several new intuitions, and see if AM can develop nonmathematical
theories (elementary physics, program verification). This would also require
limiting AM's freedom to "ignore a given body of data and move on to something
more interesting".

.NSECP(Model of Math Research)

In order to intelligently discuss how to automate an activity, we must be very
clear about how humans do that activity. Thus, for AM, we must begin by hypothesizing
a particular model of how mathematicians actually go about doing their research.
After presenting our model, we can then discuss how to automate the processes
involved.
Thanks to Polya, Kershner, Hadamard, Skemp, many others, and introspection,
I have pieced together a tentative such information processing
model for math theory formation:

.BEGIN INDENT 0,3,0 
.ONCE SELECT 8 TURN OFF "α" TURN ON "∞→"
⊂∞α→⊃
.NARROW 2,2  SINGLE SPACE PREFACE 0

1. The order of events in a typical mathematical investigation is:
(a) OBSERVE:
The observation is either of reality or of an analogous, already-established
mathematical theory.
(b) NOTICE REGULARITY: Perceive some patterns, some interesting relationships
that appear to hold sometimes.
Thus math research is an ⊗4empirical⊗* process.
(c) FORMALIZE: Decide on some of the objects, operators,
definitions, and statements that the theory will contain.
(d) FINALIZE: Decide which concepts are to be primitive and which aren't;
decide which statements will be considered axioms, and ensure that the others
can in fact be derived from them.
(e) DEVELOP: What additional theorems can be proven from this formal system;
do they correspond to observable phenomena in the domain which motivated
this new theory?  When new observations are made in that motivating domain,
can they be naturally phrased as formal statements in this theory; and if so,
are they provable from the existing axioms and theorems?


2. 
Notice that each step in (1) involves choosing from a large set of alternatives
-- that is, searching.
The key to keeping this from becoming a blind, explosive search is the
proper use of evaluation criteria. That is, one must constantly choose the
most interesting, aesthetically pleasing, useful,... alternative available.
This is analogous to Dendral's reliance on good heuristics to constrain the
structure generator.

3. 
But many of those criteria are usually opposed to each other 
(e.g., one often must sacrifice
elegance for utility, interestingness for safety, etc.). How should one
weight these features when deciding what to do next  during an investigation?
We believe (and one goal of AM is to test) that
the non-formal criteria (aesthetics, interestingness, inductive inference (from
empirical evidence), analogy, intuitive clarity, utility)
are much more important than formal
deductive methods in developing mathematically worthwhile theories, and
in avoiding barren diversions.
Among the subjective criteria, the  order listed above is roughly their order
of importance. However, AM should have a dynamically variable "orientation",
which under certain circumstances might induce it to seek safety (e.g., utility)
rather than uncertainty (e.g., proposing an analogous proposition).

4. The above criteria are virtually the same in all domains of mathematics,
and at all levels. Each factor encourages some pursuits and discourages others.
It is hoped that no modifications need be made to AM's judgmental scheme,
as AM acquires more and more new concepts.

5. For true understanding, AM should grasp$$  Have access to, relate to, store,
be able to manipulate, be able to answer questions about *
each concept in several ways:
declarative (definition), operational (how to use it), demonic (recognizing
when it is relevant), exemplary (especially boundary examples), and
intuitive (simulated image of a real-world interpretation).

6. Progress in ⊗4any⊗* field of mathematics demands much intuition (and some
formal knowledge) of ⊗4many⊗* different "nearby" 
mathematical fields. So a broad,
universal core of intuition must be established before any single theory
can meaningfully be developed.
Intuition is contrasted with more formal representations by the fact that it
is opaque (AM cannot introspect to determine how the result is produced) and
fallible.

.WIDEN 2,2
.ONCE SELECT 8 TURN OFF "α" TURN ON "∞→"
%∞α→$
.END
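Point (5)'s several modes of "grasping" a concept suggest a record with one facet per mode. A modern Python sketch; the facet names are hypothetical, chosen to mirror the five modes listed, and AM's actual modules have many more parts:

```python
# One record per concept, with a facet for each mode of grasping it.
from dataclasses import dataclass, field

@dataclass
class Concept:
    name: str
    definition: str = ""                # declarative: what it is
    algorithm: object = None            # operational: how to use it
    relevance: object = None            # demonic: when it applies
    examples: list = field(default_factory=list)   # incl. boundary cases
    intuition: str = ""                 # simulated real-world image

singleton = Concept(
    name="Singleton",
    definition="a set with exactly one member",
    algorithm=lambda s: len(s) == 1,
    examples=[{1}, {frozenset()}],      # {{}} is a boundary example
    intuition="a lone marble is also a pile",
)
```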

.SSEC(Details)

This last point (6) merits elaboration. I believe that to develop any given field
of mathematics, one must possess much intuition about each
⊗4psychologically⊗* prerequisite field,$$ One which makes the new field
easier to learn, which contains more concrete analogues of the ideas of the new
field. For example, knowing about geometry makes it easier to learn about
topology, even though topology never formally uses any results or definitions
from geometry. * and some definite facts  about each ⊗4formally⊗* preceding 
field$$ For example, arithmetic is usually formally based upon set theory,
although most of us learn about them in the other order *  of mathematics.  
The diagram here
indicates the prerequisites for each domain which might
conceivably be worked in by AM.   To start in a given domain, some knowledge
should exist about all domains which point to the given one,
about all domains which point to ⊗4them⊗*, etc.

.GROUP SKIP 3

⊗4NOTE: This section, and especially the one after it, are
supplementary, and may be skipped on first reading.⊗*

.B0


Elementary Logic  ααα→  Theorem-Proving  ααααααααααααααα⊃
    ↑							~
    ~							~
    ~							~
    εαααααααα→  Geometry  ααα→  Topology		~
    ~		    ~		↑      ~		~
    ~		    ~		~      ~		~
    ~		    ↓	        ~      ↓ 		~
    ~      Analytic Geometry    ~   Algebraic Topology	~
    ~		 ↑              ↓      ↑		~
    ~            ~ Measure Theory      ~		~
    ↓	         ~ ↑ 		       ~		~
Boolean Algebra αβαβαα→  Abstract Algebra 		~
    ↑            ~ ~      ~				↓
.ONCE TURN ON "α"
    ~	         ~ ↓      ~		  Program Verification
    ~	    Analysis      ↓				↑
    ~		   ↑     Concrete Algebra		~
    ~              ~      ↑				~
    ~		   ~      ~				~
    ↓		   ~      ~				~
Set Theory  ααα→  Arithmetic  ααα→  Number Theory	~
		      ~					~
		      ~					~
		      ↓					~
		Combinatorics  ←ααα→  Graph Theory  αααα$
.E


Each arrow represents either a psychological or a formal prerequisite:
To work in the field pointed ↓_to_↓, one should know something about the field
pointed ↓_from_↓.
Notice that almost all the "elementary" 
branches of mathematics are either formally
or psychologically prerequisite 
to each other.$$ For example, one should have some intuition about 
sets before doing Number theory,
and one should have some intuition about numbers 
and counting before doing Set theory. * So a broad foundation of intuition, 
spanning ⊗4↓_several_↓⊗* mathematical and 
real-world concepts, is prerequisite to sophisticated behavior in ⊗4any⊗*
branch of mathematics.
This smacks of  the idea of a "critical mass" of knowledge,
so often sensationalized in science fiction. In addition to 
expecting that the corpus must exceed some minimum absolute size,
I am claiming that it must also exceed some minimum degree of richness, of breadth.
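The prerequisite rule stated above -- some knowledge of everything pointing, directly or transitively, into a domain -- is a transitive closure over the diagram. A Python sketch over a small acyclic fragment of the arrows; the full diagram's mutual psychological prerequisites are omitted here:

```python
# Which domains must AM know something about before starting work in
# a given one? Follow the prerequisite arrows backwards, transitively.
prereqs = {
    "Set Theory":    [],
    "Arithmetic":    ["Set Theory"],
    "Number Theory": ["Arithmetic"],
    "Combinatorics": ["Arithmetic"],
    "Graph Theory":  ["Combinatorics"],
}

def needed(domain):
    """All domains pointing, directly or transitively, to the given one."""
    out, stack = set(), list(prereqs.get(domain, []))
    while stack:
        d = stack.pop()
        if d not in out:
            out.add(d)
            stack.extend(prereqs.get(d, []))
    return out
```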

.SSEC(Textbook-Order vs Research-Order)

Let us elaborate on point (1) of the model. The idea there is that
the order in which a math textbook presents a theory is almost the exact
opposite of the order in which it was actually discovered and developed.

.B0  INDENT 10

PRIMITIVES, AXIOMS ααααα→ DEFINITIONS    ←αααααααααααααα⊃
				~			~
				~			~
				↓			~
			STATEMENT OF LEMMA 		~
				~			~
				~			~
				↓			~
			  PROOF OF LEMMA		~
				~			~
				~			~
				↓			~
				~			~
				~			~
				↓			~
		     STATEMENT OF NEXT LEMMA		~
⊂αααααααα⊃ 			~			~
~textbook~      		|			~
~ order  ~			|			~
%αααααααα$			|			~
				|			~
				|			~
				↓			~
		       PROOF OF LAST LEMMA		~
				~			~
				~			~
				↓			~
		      STATEMENT OF THEOREM		~
				~			~
				~			~
				↓			~
			PROOF OF THEOREM ααααααααααααααα$

.E

In a textbook, one introduces some primitive elements of the theory, postulates
some axioms relating these objects and operations, then enters a 
⊗4"Define → State → Prove⊗*" loop. Lemmas are stated and proved before the
theorems which require them. Motivations of any kind are a rarity.

Can replace each marble in a pile\We can replace each block in a tower
by a copy of some other pile.\by a copy of some other tower.→mult x

A lone marble is also a pile.\A lone block is also a tower.→one 1

etc.\etc.→etc.

***********************************************************************************
.E

These correspond to the objects and operators of the theory he is about to develop.
He gives them each a name; I have permitted him to choose the common name for
each concept, but that doesn't matter, of course. Notice that most of these
are not usually considered primitive at all. 

The next list he draws up contains several specific observations, translated from
the blocks/marbles manipulation language into these new symbols.
Each relationship is translated from these symbols into the
⊗4other⊗* scenario's language, and checked. Thus all the observations here are
meaningful in both of the manipulative domains:

.BEGIN NOFILL SELECT 6 PREFACE 0 INDENT 0 TABS 15,30,45,60,75 TURN ON "\" GROUP

************************************************************************************************

0=0\1=1\¬(0=1)\¬(1=0)\¬(S(1)=1)\¬(0=S(0))

S(0)=1\1=S(0)\S(0)=S(0)\S(1)=S(1)\¬(S(0)=S(1))\¬(S(1)=S(0))

0+0=0\1+1=S(1)\0+1=1\1+0=1\S(1)+0=S(1)\S(0)+S(0)=S(1)

0x0=0\1x1=1\0x1=0\1x0=0\S(S(1))x0=0\S(S(1))x1=S(S(1))

1>0\S(1)>1\S(1)>0\¬(0>1)\¬(1>1)\¬(0>0)

************************************************************************************************

.E
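These ground observations can be checked mechanically. A sketch in modern Python (an anachronism, of course -- the researcher checks them against the marble and block scenarios themselves), with ordinary integers standing in for piles and towers:

```python
# Each line of the table becomes an assertion; S is "add one",
# as the researcher just named it.
S = lambda n: n + 1
plus = lambda n, m: n + m
times = lambda n, m: n * m

assert S(0) == 1 and not (0 == 1) and not (S(1) == 1)
assert plus(1, 1) == S(1) and plus(S(0), S(0)) == S(1)
assert times(S(S(1)), 0) == 0 and times(S(S(1)), 1) == S(S(1))
assert S(1) > 1 and S(1) > 0 and not (0 > 1) and not (1 > 1)
```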


Now comes some simple generalization, either directly from the scenarios,
perceptually from the list just compiled, or syntactically from that list
(e.g., by replacing constants by variables). 
Again,  each of these generalizations
is intuitively checked in both real-world domains.

.BEGIN NOFILL SELECT 6 PREFACE 0 INDENT 0 TABS 15,30,45,60,75 TURN ON "\" GROUP

*************************************************************************************************

If n>0, then  D(n) can be performed; else D(n) cannot be performed.

If n>0, then n>D(n)\S(n)>n\\S(n)=n+1\S(S(n))>n

If n=m then S(n)=S(m)\\\If S(n)=S(m) then n=m

D(S(n))=n\\If n>0 then S(D(n))=n\If n>0 then n+m>m

m+S(n)>m\\If n=m then ¬(n>m)\If n>m then ¬(n=m)

If n=m ∧ n>r then m>r\n=n\¬(n>n)\¬(n=S(n))

If n>m ∧ m>r then n>r\If n=m ∧ m=r then n=r\If n=m ∧ r=m then r=n

If n=m then m=n\If n>m then ¬(m>n)\n>m ∨ m>n ∨ n=m

nxm = mxn\\nxS(m)=nxm + n\\n+m=m+n

n+S(m)=S(n)+m\\(n+m)+(r+s) = n+((s+m)+r)

nx(mxr)=(nxr)xm\S(n)+D(m)=m+n

etc.

*************************************************************************************************

.E
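Unlike the ground facts earlier, these generalizations are universally quantified, so they can only be checked empirically: test each conjecture on all small cases before believing it. A sketch of such checking; the bound of 10 is an arbitrary choice of this sketch:

```python
# Empirically test universally quantified conjectures on small cases.
S = lambda n: n + 1
D = lambda n: n - 1          # "remove one"; only sensible when n > 0

def holds(conjecture, bound=10):
    return all(conjecture(n, m) for n in range(bound) for m in range(bound))

assert holds(lambda n, m: n + S(m) == S(n) + m)
assert holds(lambda n, m: n * S(m) == n * m + n)
assert holds(lambda n, m: D(S(n)) == n)
assert holds(lambda n, m: m + S(n) > m)
```

A conjecture that survives such testing is still only plausible, not proved -- which is just the distinction between the axioms-and-theorems sorting described next and the empirical process that preceded it.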

Some of these are now considered as axiomatic, as defining the operations and
structures involved; the rest are considered secondary, as theorems following from
the axioms. This process is a slow search.
The most important game is not "finding minimal axiom sets", but rather
"what else can we prove from this set of axioms and theorems".
The AM research goal is to learn how to play this game of theory development.

To summarize idea (1):
The ⊗4motivation⊗* for a new mathematical development is either:
(a) looking at reality and at the mathematics
you have already$$ Then abstract out some interesting features; 
formalize those into a
specific set of primitive structures and relations  and axioms about those
basic entities *, or (b) given an existing theory, propose some new conjecture or
definition, based on analogy to the real-world roots of this theory,$$ Based
on patterns perceived in empirical
evidence *  or based on analogy to a known interesting theorem in another theory,
or occasionally based only on applying some fruitful method and observing the
result.

.SSEC(Metaphors to other processes)

Let's consider some alternate ways of looking at such theory development, some
analogies to processes with which you're probably more familiar.


.SSSEC(Growing a tree -- using heuristic constraints)

Once motivated, the idea is developed and evaluated. At each moment, the
researcher should be working on the most promising of all the proposed ideas.
This process is thus a sophisticated expansion of a tree,
where new conceptual nodes are
"grown" in the most promising area, and where barren dead-ends are eventually
"pruned". To do mathematical research well, it is thus necessary and sufficient
to have good methods for proposing new concepts from existing ones, and for
deciding how interesting each candidate is. That is: effective growing and pruning
strategies.

.SSSEC(Exploring a space using an evaluation function)

Consider our core of premathematical knowledge.
By compounding the known concepts and methods,$$Using 
abstraction from reality, analogy with existing theories,
the postulational method, and problem-solving techniques *
we can extend this
foundation a little wherever we wish.
⊗4<Visualize: SEVERAL SHORT LINES IN BLACK, EMANATING
FROM A CENTRAL CORE>⊗*.
The incredible variety of alternatives to investigate includes
all known mathematics, much trivia, countless deadends, and so on.
The only "successful" paths near the core 
are the narrow ribbons of known mathematics 
(perhaps with a few undiscovered other slivers). 
⊗4<Visualize SNAKE-LIKE LINES IN RED, twisting away from the core, intersecting>⊗*.

How should we walk through this immense space, with any hope of following the
few, slender branches of already-established mathematics (or some equally successful
new fields)? We must do hill-climbing; as new concepts are formed, decide how 
promising they are, always explore the currently most-promising new concept. The
evaluation function is quite nontrivial, and this research may be viewed as an
attempt to study and explain and duplicate the judgmental criteria people employ.
My attempts at codifying such "mysterious" emotive forces as intuition, aesthetics,
utility, richness, interestingness, relevance... indicate that a large but not
unmanageable collection of heuristic rules should suffice.
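This hill-climbing strategy can be sketched as a best-first search over concepts. The `expand` and `worth` functions below are hypothetical stand-ins for AM's concept-proposing machinery and judgmental evaluation function, respectively:

```python
import heapq
import itertools

def best_first_explore(core, expand, worth, steps=100, prune_below=0.2):
    """Explore a concept space by always developing the currently
    most-promising candidate, as judged by the heuristic evaluation
    function `worth`.  `core` seeds the agenda; `expand` proposes new
    concepts from an existing one."""
    counter = itertools.count()   # tie-breaker so the heap never
                                  # compares concepts directly
    agenda = [(-worth(c), next(counter), c) for c in core]
    heapq.heapify(agenda)         # heapq is a min-heap, so worth is negated
    developed = []
    for _ in range(steps):
        if not agenda:
            break
        _, _, concept = heapq.heappop(agenda)
        developed.append(concept)
        for new in expand(concept):
            w = worth(new)
            if w >= prune_below:  # "pruning" barren dead-ends
                heapq.heappush(agenda, (-w, next(counter), new))
    return developed
```

Pushing off "in any direction" corresponds to seeding the agenda with arbitrary core concepts; pruning barren dead-ends corresponds to discarding proposals whose estimated worth falls below some threshold.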

The important visualization to make 
is that with proper evaluation criteria, we convert the
flat picture  to
a breath-taking relief map:
the known lines of development become
mountain ranges, soaring above the vast flat plains 
of trivia and inconsistency
below.
Occasionally an isolated
hill is discovered near the core;$$ E.g.,
Knuth's ↓_Surreal Numbers_↓ * certainly
whole ranges lie undiscovered for long periods of time:$$ E.g.,
non-Euclidean geometries * the terrain far from the initial core is
not at all explored.  

Intuition is like vision, letting the explorer observe a distant mountain,
long before he has conquered its intricate, specific challenges.

If the criteria for evaluating interestingness and promise are good enough, then
it should be straightforward to simply push off in any direction, locate some
nearby peak, and follow the mountain range along (duplicating the development in
some field). In fact, by intentionally pushing off in apparently barren directions,
new ranges might be encountered. If the criteria are "correct", then ⊗4any⊗* new
discovery the system makes and likes will necessarily be interesting to humans.

If, as is more likely, the criteria are deficient, this too will tell us much. Before
beginning, we shall strive to include all the obvious factors which enter into
judgmental decisions, with appropriate weights, etc. If the criteria fail, then
we can analyze that failure and learn about one nonobvious factor in evaluating
success in mathematics (and in any creative endeavor). After modifying the criteria
to include this new factor, we can proceed again.

.SSSEC(Syntax and Semantics in Natural Language Processing)

One final analogy: Consider assumptions, axioms, definitions, and
theorems to be 
syntactic rules for the
language that we call Mathematics. 
Thus theorem-proving, and the whole of textbook mathematics, is a purely syntactic
process.
Then these judgmental
criteria I am alluding to would correspond to the semantic knowledge associated
with these more formal methods.
Just as one can upgrade natural-language-understanding by incorporating semantic
knowledge, so AM ensures that it knows the intuitive and empirical roots of
each concept in its repertoire.

.NSECP(Knowledge AM Starts With)

At each stage, AM has some set of concepts (represented as modules,
clumps of information). AM's basic activity is to fill out the existing clumps
more, and develop new clumps. Each clump is called a BEING. Here are
the names of the BEINGs (concepts, clumps, ... ) which AM will start with:


.BEGIN FILL SELECT 6 SINGLE SPACE PREFACE 0 TURN OFF "{}-"
.INDENT 3,7,0

.ONCE PREFACE 3 INDENT 0
(i) ⊗2Objects⊗*

Ordered-pair,
Variable,
Propositional-constant,
Structure,
List-structure,
Bag-structure,
Set-structure,
Assertion,
Axiom

.ONCE PREFACE 2 INDENT 0
(ii) ⊗2Actives⊗*


Operations:  Compose, Insert, Delete, Convert-structure, Substitute, Assign,
Map-structure, 
Undo,
Reverse-ordered-pair, Rule-of-inference, Disjoin, Conjoin, Negate,
Imply, Unite, Set-union, Cross-product, Common-parts, Set-intersection,
Set-difference, Put-in-order, Abstract-out-similarities, Contrast

Relations: Equality, Membership, Containment, Equivalence, 
Equipollence,
Scope, 
Quantification, ⊗7For-all, There-exists, Not-for-all, Never, Only-sometimes⊗*

Properties: Ordered, Extreme, Properties-of-Activities

.ONCE PREFACE 2 INDENT 0
(iii) ⊗2Higher-order Actives⊗*

Find, Select, Guess, Ellipsis, Analogize, Conserve, Approximate

Examine, Test, Assume, Judge,
Define, 
Prove, ⊗7Logically-deduce, Prove-directly, Cases,
Working-backwards, Prove-indirectly, Prove-universal-claims, Mathematical-induction,
Prove-existential-claims, 
Prove-existence-constructively, Prove-existence-deductively,⊗*
Disprove, ⊗7Disprove-constructively, Disprove-indirectly,⊗*

Solve-problem, Debug, Trial-and-error, Hill-climb, Subgoal, Work-indirectly,
Relations-between-problems

Communicate, Communicate-with-user, Translate-into-English-for-User,
Translate-from-English-for-concepts, User-model, Communicate-among-concepts,
Infer-from-examples, Select-representation

Isomorphism, Categoricity, Canonical, Interpretation-of-theory

.ONCE PREFACE 2 INDENT 0
(iv) ⊗2Higher-order Objects⊗*

Statement, Conjecture, ⊗7Statement-of-generality, Statement-of-existence,⊗*
Theorem, Proof, Counterexample, Contradiction, Analogy, Assumption

Problem, Problem-to-find, Problem-to-prove, Problem-to-creatively-define,
Action-problem, Bug, Inference-problem

Representation, Symbolic-representation, Diagram, Scenario

Mathematical-theory, Axiomatic-Foundation, Primitive-Basis, Formal-system

.END
.ONCE PREFACE 3

The more basic the initial core concepts, the more chance there is that the 
research will go off in directions different from humans, the more
chance it will be a waste of time, and the more valid the test of the search-pruning
forces.  Thus we won't even provide AM with the "simple" concepts of function,
inverse, commutativity, counting, etc. We do hope that AM discovers them, or
some equally interesting "low-level" building blocks.

Of course, we must ask ⊗4precisely⊗* what AM knows about each of these concepts.
But to fully answer that, we must digress and discuss how the knowledge
is represented. We must explain the modular BEINGs scheme for representing
information about a concept. This is treated curtly in the next section; a longer,
smoother introduction to BEINGs is found in Appendix 1.

.NSECP(AM's Representation of Knowledge)

AM will represent each concept in its repertoire
as a bundle of facets ⊗4(Definitions, Intuitions, Recognition, Usefulness,
Interestingness, Properties,...)⊗*, 
and each facet will be stored internally as a
little program.  Each concept will have precisely the same set of
25 facets.  This enables us to assemble, in advance, a body of knowledge
(called ⊗4strategic⊗* knowledge) about each facet. Each concept is called
a BEING$$
For historical reasons. BEINGs is a modular scheme for representing
knowledge, akin to ACTORs, 
PANDEMONIUM, SMALLTALK,
CLASSes, and FRAMEs. Details
about the origin of the BEINGs ideas
can be found in [Green et al], in [Lenat], or in Appendix 1 of this paper.*.
Depending
on how you want to visualize a BEING, its   subdivisions can be called
parts, questions, facets, or slots.
Part P of BEING B will be abbreviated as B.P.


.SSEC(Facets that each concept might have)

Each facet program can be viewed as answering a certain family of questions
about the BEING (concept) in which it is embedded$$ For example, the DEFINITION
facet of COMPOSE should be able to tell if any given entity is a composition.*.
Below is the tentative set of facets that concepts can have, followed by a
brief description of what question(s) each facet answers:


.SKIP TO COLUMN 1

.BEGIN FILL SELECT 7 PREFACE 0 SPACING 0 INDENT 15 GROUP

.ONCE FLUSH LEFT PREFACE 1
⊗4RECOGNITION GROUPING⊗*  Notice when this BEING, call it B, is relevant.

 CHANGES		Is this rele. to producing the desired change in the world?

 FINAL  		What situations is this BEING rele. to bringing about?

 PAST			Where is this used frequently, to advantage?

 IDEN {not}{quick}	{fast} tests to see if this BEING is {not} currently referred to


.ONCE FLUSH LEFT
⊗4ALTER-ONESELF GROUPING⊗*  Know about variations of yourself, how good you are, etc.

 GENERALIZATIONS	What is this a special case of? How to make this more general.

 SPECIALIZATIONS	Special cases of this? What new properties exist only there?

 BOUNDARY		What marks the limits of this concept? Why exactly there?

 DOMAIN/RANGE {not} Set of (what one can{'t} apply it to, what kind of thing one {never} gets)

 ORDERING(Complete)	What order should the parts be concentrated on (default)

 WORTH	Aesthetic, efficiency, complexity, ubiquity, certainty, analogic utility, survival basis

 INTEREST		What special factors make this type of BEING interesting?

 JUSTIFICATION   Why believe this? Formal/intu. For thms and conjecs. What has been tried?

 OPERATIONS  Properties associated with BEING. What can one do to it, what happens then?


.ONCE FLUSH LEFT
⊗4ACT-ON-ANOTHER GROUPING⊗*  Look at part of another BEING, and perhaps do something to it.

⊗1CHANGE⊗* subgrouping of parts:

 BOUNDARY-OPERATIONS {not}  Ops rele. to patching {messing}up not-bdy-entities {bdy-entities}

 FILLIN  How to initially fill it in, when and how to augment what is there already.

 STRUCTURE 		Whether, When, How to restructure (or split) this part.

 ALGORITHMS		How to compute this, do this activity. Related to Representation.

⊗1INTERPRET⊗* subgrouping of parts:

 CHECK   		How to examine and test out what is already there.

 REPRESENTATION  How should entities of type B be structured internally? Contents' format.

 VIEWS	(e.g., How to view any Active as an operator, function, relation, property, corres., set of tuples)
 


.ONCE FLUSH LEFT
⊗4INFO GROUPING⊗* General information about this BEING, B, and how it fits in.

 DEFINITION		Several alternative formal definitions of this concept. Can be axiomatic, recursive.

 INTU		Analogic interp., ties to simpler objects, to reality. Opaque.

 TIES   	Alterns. Parents/offspring. Analogies. Associated thms, conjecs, axioms, specific BEING's.

 EXAMPLES {not} {bdy}	Includes trivial, typical, and advanced cases of each type.

 CONTENTS       What is the value stored here, the actual contents of this entity.

.END

.FAD

.XGENLINES←XGENLINES+2

Let us take the organization sketched above as ⊗4literal⊗*; thus all knowledge in the
system is organized in packets called "BEINGs", 
with each BEING containing all the relevant facts and ideas about one single concept.
Each BEING has about 25 different slots or "parts",
with each part having a name and a value. One interpretation of this is that
the name represents a question, and the value represents how to answer that question.
The BEINGs are organized into five categories or "families", 
each containing about 30 BEINGs
to start with. The slots are organized into four categories, each with about
6 differently-named slots.

There is one distinguished structure, called CS (for Current Situation), which
holds pointers to all the recent system activities, notably those which have been
suspended, relevant new information, goals outstanding, BEINGs which were recently
judged interesting but haven't been worked on since then, recent comments from the
user, etc.
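Taken literally, this organization might be rendered as follows. The family names follow the five categories mentioned above, but the part values and CS fields shown are illustrative placeholders, not AM's actual inventory:

```python
# A minimal sketch of the literal organization described above: each
# BEING is a packet of named parts, and CS records recent activity.

FAMILIES = ("Static", "Active", "Static-Meta", "Active-Meta", "Archetypical")

class Being:
    def __init__(self, name, family, **parts):
        self.name = name
        self.family = family          # one of the five families
        self.parts = dict(parts)      # part name -> value: a small program
                                      # or datum answering that question

    def ask(self, part):
        """Treat the part name as a question; its value is the answer."""
        return self.parts.get(part)   # None marks a conspicuously blank part

# The distinguished Current Situation structure
CS = {
    "suspended": [],        # recently suspended activities
    "new_info": [],         # relevant new information
    "goals": [],            # goals outstanding
    "deferred": [],         # interesting BEINGs not yet worked on
    "user_comments": [],    # recent comments from the user
}
```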

.SSEC(Overall View of Representation in AM)

The two broad categories of knowledge are definite and intuitive. To represent
the former, we employ (i) rules and assertions, (ii) BEINGs grouped into families,
and (iii) opaque Environment functions. To represent the latter, we employ
(i) abstract rules, (ii) pictures and examples, and (iii) opaque Environment functions.


Each currently popular Artificial Intelligence formalism for representing knowledge
lies somewhere along (or very near to) the ideal
"intelligence vs. simplicity/speed" tradeoff curve.  Another way to
see this is to picture each representation as some tradeoff between
structure and uniformity, between declarative and procedural
formulations.  Each idea has its uses, and it would be unwise to
demand that any single representation be used everywhere in a given
system.  One problem with the alternative is how to interface the
multiple representations.  Usually this is enough to persuade
system-builders to make do with a single formalism.  The proposed
system will be pushing the limits of the available machinery in both
the time and space dimensions, and therefore cannot indulge in such
luxuries!  Knowledge used for different types of tasks must be
represented by the most suitable formalism for each task. 

BEINGs are highly structured, intelligent, but slow. Rules and
assertions are more uniform and swift, but frequently awkward. 
Compiled functions win on speed but lose on accessibility of the
knowledge they contain.  Pictures and examples are universal but
inefficient in communicating a small, specific chunk of information. 

Let us now partition the types of tasks in our system among these
various representations.  The frequent, opaque system tasks (like
evaluating interestingness, aesthetic appeal, deciding who should be
in control next) can be programmed as compiled functions (the only
loss -- that of accessibility of the information inside -- is not
relevant since they should be opaque anyway). 

The specific math knowledge has many sophisticated duties, including
recognizing its own relevance, knowing how to apply itself, how to
modify itself, how to relate itself to other chunks of knowledge,
etc. It seems appropriate that BEINGs be used to hold and organize
this information. The main cost, that of slowness, is not critical
here, since each individual chunk is used infrequently, and a wrong
usage is far more serious than a slow usage.  One final factor in
favor of using BEINGs here is that all the knowledge that is
available at the time of creation of a new BEING will find its way to the
right place; any missing knowledge will be conspicuous as a blank or
incomplete BEING part. 

The contents of each part of each BEING is composed of specialized rules,
assertions, simple programs, and pointers to other parts and other BEINGs. The
knowledge may have to be altered from time to time, hence must be
inspectable and interpretable meaningfully and easily, so compiled
code is ruled out. To demand that each part of each BEING be itself a BEING
would trivially cause an infinite regress. Hence the reliance upon
"intermediate" representations. 

Communication between very different entities, for example between
the User and a BEING not designed to talk with him, are best effected via
a picture language and/or an examples language (from which the
receiver must infer the message). Such universal media are proposed
for faltering communications, for holding and relating intuitions of
the essences of the knowledge chunks stored in the BEINGs. 

The representation of intuitive knowledge as pictures and examples is
certainly not original.  Set theory books usually have pictures of
blobs, or dots with a closed curve around them, representing sets.
For our purposes, a set will be represented in many ways.  These
include pointer structures for ⊗6ε⊗*, ⊂, and their inverses; analytic
geometric functions dealing with sets as equations representing
regions in the plane; prototypical examples of sets; a collection of
abstract rules for simulating the construction and manipulation of
sets; and, finally, a set might be intuitively represented as a
square in the cartesian plane.  All these are in addition to the
definite knowledge about sets (definition, axioms and theorems about
sets, specific analogies to other concepts). 

The notion of a fuzzy rule will remain fuzzy throughout this
document. The basic idea is that of a production system, with left
sides that can latch onto almost anything, which eventually generate
lots of low-certainty results. These would augment some
BEINGs' intuition parts, and when trying to relate two given BEINGs which
both had such fuzzy abstract rules, one might try to "run" the
combined production system, or merely to "compare" the two systems. 
As with pictures and examples, the benefits of universality and
adaptability outweigh the inefficiencies. 
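A minimal rendering of such a fuzzy production system, under invented matching and certainty conventions (the "latch onto almost anything" left sides are here just loose predicates, and certainties compound multiplicatively):

```python
# Sketch of a fuzzy production system generating low-certainty results.
# The matching and certainty scheme is invented for illustration.

def run_fuzzy(rules, facts, rounds=3):
    """rules: list of (test, action, certainty) triples.
    facts: dict mapping statement -> degree of belief.
    Each round, every rule whose test accepts some fact adds the
    rule's conclusion with diminished certainty."""
    facts = dict(facts)
    for _ in range(rounds):
        new = {}
        for test, action, cert in rules:
            for fact, belief in facts.items():
                if test(fact):
                    conclusion = action(fact)
                    c = belief * cert          # low-certainty compounding
                    new[conclusion] = max(c, new.get(conclusion, 0.0))
        facts.update(new)
    return facts
```

"Running" the combined systems of two BEINGs would then amount to pooling their rule lists and fact sets before calling run_fuzzy.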

Opaque simulations of (about a dozen) real-world situations are another important
component of the representation of intuitive knowledge. For example,
there might be a simulated jigsaw puzzle, with models of pieces,
goals, rules, hints, progress, extending, etc.  There will be a
simulated playground, with a seesaw model that will respond with what
happens whenever anything is done to either side of the seesaw. There
will be flamboyant models, like mountain-climbing; mundane ones like
playing with blocks; etc.  The obvious representation for these simulations
is compiled functions, which are automatically opaque. 

The next items of interest are which parts each BEING must have. 
In PUP6, each BEING had (theoretically) exactly
the same set of parts. In AM, each ⊗4family⊗* will have the same set.
For each possible part, we list below those families having that part:

.BEGIN W(7); TABS 30,40,50,62,74; TURN ON "\"  GROUP
@2   ↓_Part Name_↓\Static\Active\Static Meta\Active Meta\Archetypical
.TABS 34,44,57,69,80
.INDENT 6

⊗6RECOGNITION GROUPING⊗*
 CHANGES\X\X\X\X\X
 FINAL\X\X\X\X\X
 PAST\X\X\X\X\X
 IDEN {not}{quick}\X\X\X\X\X

⊗6ALTER GROUPING⊗*
 GENERALIZATIONS\X\X\X\X
 SPECIALIZATIONS\X\X\X\X
 BOUNDARY\X\X\X\
 DOMAIN/RANGE {not}\\X\\X
 ORDERING(Complete)\\X\X\X
 WORTH\X\X\X\X
 INTEREST\X\X\X\X
 JUSTIFICATION\\\X\
 OPERATIONS\X\X\X\X

⊗6ACT GROUPING⊗*
		CHANGE subgrouping of parts
 BOUNDARY OPERATIONS {not}\X\X\X\
 FILLIN\\\\\X
 STRUCTURE\\\\\X
 CHECK\\\\\X
 ALGORITHMS\\X\\X\
		INTERPRET subgrouping of parts
 REPRESENTATION\X\X\X\X\
 VIEWS\\X\X\\

⊗6INFO GROUPING⊗*
 DEFINITION\X\X\X\X\X
 INTU\X\X\X\X\X
 TIES\X\X\X\X\X
 EXAMPLES {not} {bdy}\X\X\X\X\
 CONTENTS\X\X\X\X\

⊗1**********************************************************************⊗*

.END

.NSECP(Flow of Control in AM)

.SELECT 1

Control in the system: As AM runs, at each moment, each concept
will have only a few of its facet programs filled in; most of the
facets of most of the concepts will be unknown. AM's only control
structure is to repeatedly choose some facet of some concept
(some part of some BEING), and
then use the appropriate strategic knowledge to fill in that missing
program.  The strategic knowledge will typically access and run many
other facet programs from many other concepts.  In the course of
this, new concepts may be constructed and worth giving names to$$
If one part of a BEING becomes large and interesting, it may be split into
several brand new BEINGs. This is how new concepts develop.   For
example, in filling out the Examples part of the BEING 
dealing with Composition, the system will have to think up new compositions
and relations. If one of them turns out to be interesting, it will be
made into a new BEING, and almost all of its 25 parts will be blank. So the
system will then have 25  well-defined questions to try to answer about
a specific, promising new concept.
The Examples part of the Composition BEING would now contain a pointer to
this new BEING.*.
Whenever this happens, the new concept has almost all its facets
blank, which means AM now has about 20 specific tasks to eventually
attend to (if they're deemed interesting enough). So the system never
runs out of things to do; rather, that number keeps growing rapidly. 
It is the judgmental criteria which must limit this otherwise-explosive
growth.
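The control structure just described reduces to a single loop. The functions `choose_facet`, `fill_in`, and `interest` below are hypothetical stand-ins for AM's strategic and judgmental knowledge:

```python
# AM's only control structure, as a loop sketch: repeatedly choose some
# facet of some concept and fill it in; each fill-in may yield new
# concepts, whose blank facets become further tasks.

def am_loop(concepts, choose_facet, fill_in, interest, threshold, steps):
    for _ in range(steps):
        being, facet = choose_facet(concepts)   # pick most promising (B, P)
        new_concepts = fill_in(being, facet)    # run strategic knowledge
        for c in new_concepts:
            if interest(c) >= threshold:        # judgmental criteria limit
                concepts.append(c)              # the explosive growth
    return concepts
```

The number of pending tasks never shrinks to zero; only the interest threshold keeps the agenda from exploding.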


AM is interactive: AM informs a human user of the things it finds
which it thinks are interesting.  The user can interrupt AM and
interrogate it, redirect its energies, and so on. Since the user has
several roles$$
Creator (decide what knowledge AM starts with), Guide (by encouragement,
discouragement, suggestion, hint), Absolute Authority (provide needed facts
which AM couldn't derive itself), Co-researcher (if AM is operating in a
domain unknown to the user). *,
AM should have several modes of communicating with him.$$
Traditional math
notation, textbook English, formal (e.g. AUTOMATH or predicate
calculus), pictorial (simulated visual intuitions), examples (I/O pairs, traces).*
Protocols indicate this is feasible; one should
only require a few minutes' instruction before 
comfortably using the system. 

.XGENLINES←XGENLINES-3

.SSEC(Belief and Justification)

The system will justify what it believes with ⊗4intuition⊗* and ⊗4empirical
evidence⊗* as well as with formal reasoning. This justification 
would be recomputed only when needed (e.g., if an apparent
contradiction arose).$$
The aim is
not to build a theorem-prover, yet the leap from "very probable" to 
"certain" is not ignorable.  Many statements are infinitely probable yet
silly. Some sophisticated studies into this problem have
been done [Pietarinen] and may prove usable. 
The mechanism for belief in each fact, its certainty, should be
descriptive (a collection of supporting reasons) with a vector of numerical
probabilities (estimated for each factor) attached. These numbers
would be computed at creation of this
entity, recomputed only as required.   The most fundamental entities may
have ↓_only_↓ numerical weights. 
If the weight of any entity changes, no "chasing
around" need be done. Contradictions are not
catastrophic: they simply indicate that the reasons supporting each of the
conflicting ideas should be reexamined, their intuitive and formal
justifications scrutinized, until the "sum" of the ultimate beliefs in
the contradictory statements falls below unity, and until some intuitive
visualization of the situation is accepted.
If this never happens, then a problem really exists here, and might
have to be assimilated as an exception to some rule, might decrease the
reliability placed on certain facts and methods, might cause the rejection
of some mathematical theory as inconsistent after all, etc.
This belief algorithm, whatever its details, should be embedded implicitly in the
control environment; AM should not have the power to inspect or
modify it. *
Intuition will be simulated by functions which are opaque$$ Their knowledge is
not inspectable by the rest of the system; e.g., compiled code * and fallible.
The footnote
below gives a brief example of how they might work.$$ Consider the intuition about
a seesaw.   This is useful for visualizing anti-symmetry and symmetry,
associativity, and the concept of multiplication (distance x weight).
Our seesaw function will accept some features of a seesaw scenario:
details about each person involved in the scene (exactly where they were,
what their names were, how much they weighed)  and about the initial and final
position of the seesaw board. The intuition function SEESAW 
would accept a ↓_partial_↓ list of such
features, and then return a fully-filled out list. If the final position of
the seesaw 
is one of the omitted parameters,
the function will compute the sums of (weight x distance)
on each side, and decide what happened to the board. If some feature like a person's
name was omitted, it will fill that in by randomly choosing some name. Of course,
the caller of the function doesn't know which features are irrelevant; he doesn't
know which details of the gestalt are computed and which are random. *
Here is the most serious attack on the reliance upon divinely-provided 
intuitive abilities:
we creators might stack the deck; we might contrive just the right intuitions to
drive the worker toward making the "proper" discoveries.  The rebuttal is two-pronged:
first, one must assume the integrity of the creators; they must strive not
to anticipate the precise uses that will be made of the intuition functions. Second,
regardless of how contrived it was, if a small set of intuition models were found
which are sufficient to drive a researcher to discover a significant part of
mathematics, that alone would be an interesting discovery 
(educators would like to ensure
that children understood this core of images, for example).
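The SEESAW function of the footnote might look like this; the behavior (sum weight x distance per side, fill irrelevant omitted details randomly) follows the footnote, but the field names and name pool are invented for illustration:

```python
import random

def seesaw(scene):
    """Opaque, fallible intuition function: take a partial description
    of a seesaw scene and return a fully filled-out one.  The caller
    cannot tell which features were computed and which were random.
    scene["people"]: list of dicts with keys 'name', 'weight',
    'distance' (signed: negative means the left side of the board)."""
    scene = dict(scene)
    for person in scene.get("people", []):
        if "name" not in person:          # irrelevant detail: fill randomly
            person["name"] = random.choice(["Ann", "Bob", "Cal"])
    if "final_position" not in scene:     # relevant detail: compute it
        torque = sum(p["weight"] * p["distance"] for p in scene["people"])
        scene["final_position"] = ("level" if torque == 0 else
                                   "right down" if torque > 0 else
                                   "left down")
    return scene
```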


.XGENLINES←XGENLINES-3

.SSEC(The Environment)

A rather specialized ⊗4environment⊗* exists to support these BEINGs. Encoded as
efficient opaque functions, the environment must oversee the flow of control
in the system (although the BEINGs themselves make each specific decision as to
who goes next). It must also include evaluations of belief, interest, superficiality,
safety, utility; it must keep brief statistics on when and how each part of each
BEING is accessed; and the environment must maintain a rigidly-formatted
description of the Current Situation (abbreviated CS; this 
structure also includes summaries of recent system history).
When a part is big and heavily accessed, detailed records
must be kept of each usage (how, why, when, final result) of each ⊗4subpart⊗*.
Based on this, the part may be split into a group of new BEINGs, and the value
of the part replaced by a pointer to a list of these new BEINGs. 

The environment would have to accept the returning messages of the attempt to
deal with a certain part of a certain BEING. A success or a failure would mean
backing up to the last decision and re-making it
(usually the top-level "select (P,B) to work on next" decision).
An "interrupt" from a trial
would mean "here is some possibly more interesting info". The environment
must decide if it is in fact judged to be more interesting; 
if not, control is returned to the interrupted process.
If so, control automatically switches to that new, interesting part.
Later, there will be no automatic return to the interrupted
process, but whatever sequence of decisions led to its initiation may very
probably lead there again.
Two tricks are planned here. One is a cache: each BEING will let  its
RECOG parts store the last value computed, and let each
such part have a quick predicate which can
tell if any feature of the world has changed which might affect this value.
If not, then no work is done; the old value is simply
returned. If x is interrupted, an auxiliary development is pursued, and
work on x then resumes, most of the decisions leading back to x will
probably not involve any real
work, since most of the world hasn't changed. The second trick is that to
evaluate a part, one moves down its code with a cursor, evalling. When
interrupted, that cursor is left just at the point one wants to start at when
the work resumes.
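The first trick, caching a RECOG part's last value behind a quick has-anything-relevant-changed predicate, might be sketched as follows (field and parameter names are invented for illustration):

```python
# Cache for a RECOG part: store the last computed value, guarded by a
# quick predicate that tells whether any feature of the world has
# changed which might affect that value.

class CachedPart:
    def __init__(self, compute, changed_since):
        self.compute = compute              # the expensive real work
        self.changed_since = changed_since  # quick predicate on the world
        self.last_value = None
        self.stamp = None                   # None: never yet computed

    def value(self, world):
        # Only recompute when a relevant feature of the world changed;
        # otherwise no work is done and the old value is simply returned.
        if self.stamp is None or self.changed_since(world, self.stamp):
            self.last_value = self.compute(world)
            self.stamp = world.get("tick")
        return self.last_value
```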

New BEINGs are created automatically if, when a part is evaluated and a new entity
formed, the entity has sufficient interest to make it worth keeping by name.
Also, an existing part P of a BEING B may be replaced by a list of new BEINGs.
The details of when and how to do this restructuring of B.P are stored under the 
Structure part of the archetypical BEING whose name is P. The typical process is as follows:
The environment keeps loose checks on the size and usage of each part; if one
ever grows and eats up much time, it is carefully monitored. Eventually, its
subparts may partition into a set whose usage is nearly disjoint. If this
is seen, then the part is actually split into a new set of BEINGs.
If a new BEING doesn't live up to its expectations, it may be destroyed or enslaved
(overwritten and forgotten; perhaps enough is remembered to not waste
time later on the same concept; perhaps it is denigrated into just a subpart of some
part of another BEING.)

Here is a more detailed (but still tentative) description of some of these top-level
environment routines:

COMPLETE(P,B) means fill in material in part P of BEING B. 

.QP2←PAGE

@21. Locate P and B.⊗*
If P is unknown but B is known, ask B.ORDERING
and up⊗A*⊗*(B).ORDERING. Also,
there may be special information
stored in some part(s) of B earlier, by other BEINGs, which make them more or less
promising to work on now$$
up↑n(B).P means the set of BEINGs named P, B.P, 
(B.Ties.Up).P,
((B.Ties.Up).Ties.Up).P,
etc.*.

If B is unknown but P is known, ask P and ask each BEING about interest of filling in P.
Each BEING runs a quick test to see if it is worth doing a detailed examination.
Sometimes the family of B will be known (or at least constrained).

If neither is known, each BEING must see how rele. it is; the winner decides on P.
If there is more than one BEING tied for top recognition, then place the results
in order using the environment function ORD, which examines the Worth components
of each, and by using the value of the most promising part to work on next for each
BEING. The frequent access to the (best part, value) pair for each BEING means that
its calculation should be quick; in general, each BEING will recompute it only when new
info. is added to some part, or at rare intervals otherwise.
After ranking this list, chop it off at the first big break in values, and print it out
to the user to inspect. Pause WAIT seconds, then commence work on the
first in the list. 
WAIT is a parameter set by the user initially$$   0 would mean go on unless user
interrupts you, infinity would mean always wait for user's reply, etc.*.
When you finish, don't throw the list away until after the
next B is chosen, since the list might immediately need to be recomputed! 
If the user
doesn't like the choice you've made, he can interrupt and switch you over.
A similar process occurs if P is unknown, (except the list is never saved).

@22. Collect pointers to helpful information: ⊗*
 Create a (partially ordered) plan for dealing with part P of BEING B (abbreviated B.P).
 This includes the P.FILLIN part, and in fact any existing up⊗A*⊗*(B).P.FILLIN, and
 also some use of the representation, defn, views, dom/range parts of the P BEING.
 Consult ALGORITHMS and FILLIN parts of B and all upward-tied BEING's to B.

@23. Decide what must be done now⊗*; 
 which of the above pieces of information is "best". Tag it as having been tried.
 If it is precisely = one currently active goal, then forget it and go to 3.

@24. Carry out the step.⊗* (Evaluate the interest of any new BEING when it is created)
 Notice that the step might in turn call for accessing and (rarely) filling
 in parts of other BEINGs. This activity will be standard hierarchical calling.
 As parts of other BEINGs are modified, update their (best part, value) estimate.

@25. When done, update.⊗*
 Update statistics in B, P, and current situation. (worth and recog parts)
 If we are through dealing with B.P (because of higher interest entity ∃,
 or because the part is filled in enough for now) goto 1; else goto 3.
 If you stop because of higher interest entity, save the plan for B.P inside B.P.
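Steps 1-5 of COMPLETE(P,B) can be summarized procedurally. Every helper function below is a stand-in for the environment routines described above:

```python
# Procedural sketch of COMPLETE(P, B).  pick_pb, plan_for, best_step,
# carry_out, update, and done are hypothetical stand-ins for the
# environment machinery described in steps 1-5.

def complete(pick_pb, plan_for, best_step, carry_out, update, done):
    being, part = pick_pb()                 # 1. locate P and B
    plan = plan_for(being, part)            # 2. collect helpful information
    while True:
        step = best_step(plan)              # 3. decide what must be done now
        step["tried"] = True                #    tag it as having been tried
        result = carry_out(step)            # 4. may access other BEINGs
        update(being, part, result)         # 5. update statistics in B, P, CS
        if done(being, part):               #    filled in enough for now?
            return being, part
```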


CURRENT SITUATION is a vector of weights and features of the recent behavior of the system.
.FILL

The Environment also maintains a list of records
and statistics of the recent past activities, in a structure called CS, 
for "Current Situation".
Each Recognition grouping part is prefaced by a vector of numbers which are
dot-multiplied into CS, to produce a rapid rough guess of relevance.
Only the best performers are examined more closely for relevance.
The representation of each CS component is (identification info, motivation,
safety, interest, work done so far on it, final result or outlook). The
actual components might be:
.NOFILL; PREFACE 0

Recent Accesses.   For each, save (B, P, contents of subpart used).
Recent Fillins.    Save (B, P, old contents which were altered).
Current Hierarchical History Stack.  Save  (B, P, why).
Recent Top-level B,P pairs.
A couple of significant recent but not current hierarchical (B,P,why) records.
A backward-sorted list of the most interesting but currently-deferred (B,P) fillins.
A few recent or colossal fiascos (B, P, what, why this was a huge waste).
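The dot-product prefilter mentioned above can be sketched as follows (in Python, for concreteness only; the feature vectors and BEING names are invented): each Recognition grouping carries a weight vector, which is dot-multiplied into CS to get a cheap relevance estimate, and only the top scorers get the expensive close examination.

```python
# Hypothetical sketch of the CS dot-product relevance prefilter.

def rough_relevance(recog_vector, cs_vector):
    """Dot-multiply a Recognition grouping's weights into the Current
    Situation vector: a rapid, rough guess of relevance."""
    return sum(w * f for w, f in zip(recog_vector, cs_vector))

def best_performers(beings, cs_vector, k=2):
    """Keep only the k highest scorers for closer (costly) examination."""
    scored = sorted(beings,
                    key=lambda b: rough_relevance(b["recog"], cs_vector),
                    reverse=True)
    return scored[:k]

cs = [1.0, 0.5, 0.0]                 # toy Current Situation features
candidates = [{"name": "COMPOSE",  "recog": [2, 0, 1]},
              {"name": "EXAMPLES", "recog": [0, 4, 0]},
              {"name": "TIE",      "recog": [1, 1, 1]}]
top = best_performers(candidates, cs)
# COMPOSE scores 2.0, EXAMPLES 2.0, TIE 1.5; the top two survive the prefilter
```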


ORD(B,C)  Which of the recognition-tied BEINGs B,C is potentially more worthwhile?

.FILL

This simple ordering function will probably examine the Worth vectors,  perhaps
involving the sum of weighted factors, perhaps even cross-terms such as
(probability of success)*(interest rating).
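One possible reading of ORD, sketched in Python (the weights and the Worth-vector components are invented for illustration; the cross-term is the one suggested above):

```python
# A guess at ORD(B, C): weighted sum over Worth factors, plus the cross-term
# (probability of success) * (interest rating). All numbers are invented.

def worth_score(being, weights=(1.0, 1.0)):
    w = being["worth"]
    linear = weights[0] * w["p_success"] + weights[1] * w["interest"]
    cross  = w["p_success"] * w["interest"]     # the suggested cross-term
    return linear + cross

def ORD(B, C):
    """Which of the recognition-tied BEINGs B, C is potentially more worthwhile?"""
    return B if worth_score(B) >= worth_score(C) else C

B = {"name": "B", "worth": {"p_success": 0.9, "interest": 0.2}}
C = {"name": "C", "worth": {"p_success": 0.4, "interest": 0.8}}
# worth_score(B) is about 1.28, worth_score(C) about 1.52, so ORD prefers C:
# the cross-term rewards C's balance of success-chance and interest
```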

.BEGIN SELECT 6; NOFILL; PREFACE 0


PLAUSIBILITY(z)       How believable is z?    INTEREST(z)    How interesting is z?

         each statement has a probability weight attached to it, the degree of belief
         this number is a fn. of a list of justifications
	 Polya's plausibility axioms and rules of inference
         if there are several alternate justifs., it is more plausible
         if some consequences are verified, it is more plaus.
         if an analogous prop. is verified, it is more plaus.
         if the consequences of analogue are verif., it is slightly more plaus.
         the converses of the above also hold
         believe in those things with high enough prob. of belief (rely on them)
         this level should fluctuate just above the point of belief in contradictions
         the higher the prob., the higher the reliability
         the amt. one bets should be prop. to the reliability
         the interest increases as the odds do
         Zadeh: p(∧) is min, p(⊗6∨⊗*) is max, p(¬) is 1-p.
         Hintikka's formulae (λ, αα)
         Carnap's formulas (λ)
         p=1 iff both the start and the methods are certain ←← truth
         p=0 iff both start is false and method is false-preserving ←← falsity
	 p is higher as the plausibility is higher, and as the interest is lower
         if ∃ several alternative plaus. justifs., p is higher
         don't update p value unless you have to
         update p values of contradictory props.
         update p values of new props
         maybe update p value if it is a reason for a new prop
      empiricism, experiment, random sampling, statistics
         true ideas will be "verified" in (consistent with) any and all experiments
         false ideas may only have a single exceptional case
	 a single exception makes a universal idea false
         nature is fair, uniform, nice, regular; coincidences have meaning
         more plaus. the more cases verified
         more plaus. the more diff. types of cases verified
         central tendency (mean, mode, median)
         standard deviation, normal distribution
         other distributions (binomial, Poisson, flat, bimodal)
         statistical formulae for significance of hypothesis
      regularity, order, form, arrangement
         economy of description means regularity exists
         aesthetic desc (ana. to known descs. elsewhere)
         each part of desc. is organized regularly
         the parts are related regularly

  Below, αα means ⊗4increases with increasing⊗* (proportionality), and
  αα⊗A-1⊗* means ⊗4decreases with increasing⊗* (inversely proportional).
  Perhaps one should distribute these morsels among the various concerned BEINGs:
   Completeness of an analogy  αα  safety of using it for prediction
   Completeness of an analogy  αα⊗A-1⊗* how interesting it is
   How expected a relationship is  αα⊗A-1⊗*  how interesting it is
   How intuitive a conjec/relationship is  αα⊗A-1⊗*  how interesting it is
   How intuitive a conjec/relationship is  αα  how certain/safe it is
   How superficial something is  αα  how intuitive it is
   How superficial something is  αα  how certain it is
   How superficial something is  αα⊗A-1⊗* how interesting it is

  Perhaps included here should be universally applicable algorithms for applying these rules
  when choosing the best strategies, as a function of the situation.

   One crude estimate of interest level is the interest component of the current BEING's
   Modify this estimate in close cases using the above relations
   Generally, choose the most specific strategies possible
   If the estimated value of applying one of these falls too low, try a more general one
   Rework the current BEING slightly, if that enables a much more specific strategy to be used
   Locate specific concepts which partially instantiate general strategies
   The more specific new strategies are associated with the specific info. used
   Once chosen, use the strategies on the most promising specific information
   If a strat. falters: Collect the names of the specific, needed but blank (sub)parts
      Each such absence lowers int. and raises cost, and may cause switch to new strategy
      If too costly, low int, store pointer to partial results in blank parts 
         The partial results maintain set of still-blank needed parts

   Competing goals: On the one hand, desire to maximize certainty,
      safety, complete analogies, advance the level of intuition.
      On the other hand, desire to maximize interestingness, find poss. and poten. interesting ana.
       find unexpected, nonsuperficial, and unintuitive relationships.
   If an entity is used frequently, it should be made efficient.
      Conversely, try to use efficient entities over nearly
      equivalent (w.r.t. given purpose) but inefficient ones.
   If an entity is formally justified but resists intuitive comprehension, its use is
      dangerous but probably very interesting; ibid for intuitive but unprovable.
   Resolve choices in favor of aesthetic superiority

   Maximize net behavior
    Maximize desired effects
      In this case, prefer hi interest over hi safety.
      Generally preferred to the following case.
    Minimize costs, conserve resources
      In this case, prefer safety to interest.
      Locate the most inefficient, highest-usage entity, and improve or replace it
      Use: If time/space become a problem, worry about conservation until this relaxes.
.END
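The Zadeh rules quoted in the list above (p(∧) is min, p(∨) is max, p(¬) is 1-p) can be made concrete in a few lines of Python. The `plausibility` function below is one invented way to realize "if there are several alternate justifications, it is more plausible": a noisy-OR over justification weights, which (unlike Zadeh's max) strictly grows with each extra justification.

```python
# Zadeh's connectives, plus a hypothetical noisy-OR combination rule for
# "several alternate justifications make a statement more plausible".

def p_and(pa, pb): return min(pa, pb)
def p_or(pa, pb):  return max(pa, pb)
def p_not(pa):     return 1 - pa

def plausibility(justifications):
    """Combine alternative justifications so each extra one raises the total
    (strictly above Zadeh's max whenever there are several)."""
    p_none = 1.0
    for j in justifications:
        p_none *= (1 - j)          # chance that no justification holds
    return 1 - p_none

# Three partial justifications of weight 0.3, 0.7, 0.5 combine to about 0.9,
# comfortably above the best single one (0.7)
```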

.NSECP(Details: Getting AM going)

This diversion is only for the interested reader. It may be skipped without
loss of continuity. It deals with short-cuts which I can take
to get AM running faster and more easily.

When AM proposes a conjecture, the user may interrupt and say "Assume it is true,
and go on". If he does this consistently, we can debug AM's higher-level
proposing abilities even before its lower-level proving abilities are perfected.

We may be able to beat the "critical mass" constraint slightly.
The vast amount of necessary initial knowledge can be 
generated from a much smaller
core of intuition and definite facts, using the same collection of
strategies and wisdom which also drive the later discovery and the 
development.
Since AM has been given good knowledge-acquisition (strategic) powers,
AM itself should be able to fill in the blank parts of the concepts it starts
with.$$
When you are preparing the knowledge to initially get AM going,
the way to save some effort is by 
omitting whole categories of knowledge which you can expect
the system to discover quickly for itself.
AM has to have good methods for generating examples of whatever
concepts it later comes up with; those same methods can be used to produce
examples of the initially-supplied concepts. This means that the creators can
omit all examples of the concepts originally supplied; the system
will be able to fill in all such gaps.
Notice that AM has to be able to fill in ↓_all_↓ the parts (slots)
of each newly discovered concept, so perhaps only the barest definition or
intuition need be initially taught about each concept.*
The only constraint on 
such a partially-equipped
system is that it first develop a broad base of
intuition, examples, and interrelations, spanning several areas of interest,
before trying to  develop any one particular area in depth.
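The footnote's point, that the creators may omit all examples because AM can regenerate them from definitions, can be sketched as follows (modern Python; the concept representation and the "doubleton" concept are invented for illustration): run candidate objects past the concept's definition predicate and record the survivors in the Examples slot.

```python
# A minimal sketch of "fill in the Examples facet from the Definition alone".
# All names and the concept representation are hypothetical.

def fill_in_examples(concept, candidate_stream, want=3):
    """Generate examples of a concept by filtering candidates through its
    definition predicate; store them in the (initially empty) Examples slot."""
    defn = concept["defn"]
    examples = []
    for x in candidate_stream:
        if defn(x):
            examples.append(x)
            if len(examples) >= want:
                break
    concept["examples"] = examples
    return examples

# The creators supply only a definition of, say, "doubleton set":
doubleton = {"name": "DOUBLETON",
             "defn": lambda s: isinstance(s, frozenset) and len(s) == 2}
candidates = [frozenset(), frozenset({1}), frozenset({1, 2}),
              frozenset({2, 3}), frozenset({1, 2, 3})]
exs = fill_in_examples(doubleton, candidates)
# two doubletons are found; the Examples slot was never hand-coded
```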

.COMMENT XGENLINES←XGENLINES-2;

A balanced distribution of intelligence ought to exist: related facets which
are rarely accessed separately should be coalesced, and any single type of facet
used very heavily should be split. Notice that this theme has two quite different
realizations. During run time, this policy would refer to the contents held in
existing BEINGs' parts$$ E.g., the Structure part exists only to indicate what to
do about a BEING's 
part getting too big.*. Before the system is actually created, however,
this policy means altering the proposed anatomy of BEINGs $$ Proclaiming that some
family has to have this extra part present, that these three parts can be replaced
(in all BEINGs of a given family) by the following more general part, etc.*.
The runtime restructurings occur based on knowledge recently created by the
system; the planning-stage restructurings are now being based on results from
hand simulations of the system.  Remember: during runtime, the set of parts that
any specific BEING can ever have is fixed by the family (profession) to which he
belongs.
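The runtime half of the "balanced distribution" policy might be caricatured as follows (Python, for concreteness; the thresholds and access statistics are entirely invented): facets accessed very heavily are candidates for splitting, facets rarely accessed are candidates for coalescing.

```python
# A toy version of the balanced-distribution policy. Thresholds are invented.

def rebalance(facets, access_counts, split_above=100, coalesce_below=5):
    """Propose restructurings from access statistics: split a heavily-used
    facet with several contents, coalesce a rarely-used one."""
    actions = []
    for name, count in access_counts.items():
        if count > split_above and len(facets.get(name, [])) > 1:
            actions.append(("split", name))
        elif count < coalesce_below:
            actions.append(("coalesce", name))
    return actions

facets = {"EXAMPLES": ["e1", "e2", "e3"], "STRUCTURE": ["s1"]}
counts = {"EXAMPLES": 250, "STRUCTURE": 2}
actions = rebalance(facets, counts)
# EXAMPLES is proposed for splitting, STRUCTURE for coalescing
```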

One difference from PUP6 is that in AM the BEINGs are grouped into families.
Each family has its own set of parts (although there will be many parts present in
many families, e.g. 
Iden). For each family F there will be a fairly general BEING named
F. Under each part of F
is general information which, 
though not applicable to all BEINGs, is applicable to all
BEINGs belonging to family F.  
Similarly, if P is a part name, then the BEING named P contains
information which is useful for dealing with part P of any BEING. There might also exist
an archetypical BEING named F.P, who would have special information for working with
part P of any BEING in family F.  There might even be a BEING called B.P, where B is
some specific BEING, with information that just deals with part P of B and any future
specializations of B. The information stored inside a part of a BEING, for example
the actual contents of B.P, would be 
code capable of computing
B's answer to the question P; the previously
mentioned archetypical BEING named B.P 
would contain strategies for dealing with such
answering code (how to fill it in, how to check it, etc.).
To reiterate:   the contents of a part
are specific knowledge, a little program which can answer a specific query, whereas
the contents of the parts of an archetypical BEING are partially ordered sets
of strategies
for dealing with that part of that type of BEING (how to
extend it, what its structure is, and so on).
Notice we are saying that all the parts with
the same part name, 
of BEINGs in the same family, must all have the same structure$$
This is one additional level of structure from the BEINGs proposed in PUP6.*.
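The naming scheme just described (BEINGs named P, F.P, and B.P) amounts to a most-specific-first lookup, which can be sketched in Python (the archetype table and its contents are invented; only the lookup order comes from the text):

```python
# Sketch of finding the most specific archetype for part P of BEING B:
# try B.P first, then F.P (B's family), then the plain part-BEING P.

archetypes = {
    "COMPOSE.DOM/RANGE":   "strategies for Dom/Range of COMPOSE in particular",
    "OPERATION.DOM/RANGE": "strategies for Dom/Range of any OPERATION",
    "DOM/RANGE":           "strategies for the Dom/Range part of any BEING",
}

def archetype_for(being, family, part):
    """Return the strategies from the most specific archetypical BEING."""
    for key in (f"{being}.{part}", f"{family}.{part}", part):
        if key in archetypes:
            return archetypes[key]
    return None

# COMPOSE has its own Dom/Range archetype; INSERT falls back to OPERATION's;
# a BEING in some other family falls all the way back to the generic DOM/RANGE
```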

.SELECT 1

When part p of BEING B is filled out, at some point in the sequence S of strategies
listed under the archetypical BEING named B.p or p, some new 
information may be discovered. If S cannot handle this knowledge, then  it will
simply return with the message "I am not through, but here is some fact(s) which
may mean that filling out part p of B is no longer the best activity".
The selected part and BEING may turn out to be the same, or may change due to the new
information which was just uncovered.
The flavor of the return should thus be one of: Not Done because x is
possibly more interesting; Not Done because x is a prerequisite to doing me;
Done because I succeeded; Done because I failed utterly.
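The four return flavors above could be realized as a tagged result, sketched here in Python (the enumeration and the sample fact "x" are illustrative only):

```python
# The four return "flavors" as a hypothetical tagged result type.

from enum import Enum

class Flavor(Enum):
    NOT_DONE_MORE_INTERESTING = "Not Done: x is possibly more interesting"
    NOT_DONE_PREREQUISITE     = "Not Done: x is a prerequisite to doing me"
    DONE_SUCCEEDED            = "Done: I succeeded"
    DONE_FAILED               = "Done: I failed utterly"

def is_done(flavor):
    """Only the last two flavors let the environment discard the task."""
    return flavor in (Flavor.DONE_SUCCEEDED, Flavor.DONE_FAILED)

# A strategy sequence S that uncovers a distracting new fact x would return,
# say, (flavor, x); a Not-Done flavor makes the environment reconsider
result = (Flavor.NOT_DONE_MORE_INTERESTING, "some-new-fact-x")
```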

The lower-level BEINGs will provide fast access to well-organized information.
The background environment provides the necessary evaluation services at
high speeds (though the system cannot meaningfully examine, modify, or add to
the environment functions which the creators provide).
The BEINGs hold "what to think"; the environment implicitly controls "how to think",
and the archetypical BEINGs explicitly contain hints for "how to think".
The big assumption is that one may think creatively without completely 
knowing how his thought
processes operate; intelligence does not demand absolute introspection.

Each clump is (at least partially) ordered, hence can be executed sequentially.
The result may be to choose a lower-level clump, and/or modify some strategies
at some level (some part of some BEING), and/or create new strategies at some
level (perhaps even to create a new BEING). These latter creations and calls
will be in the form of strong suggestions to the environment.

Common knowledge should in some cases be factored out. Possibilities:
(i) always ask a specific BEING, who sometimes queries a more general
one if some knowledge is missing; (ii) always query the most general
BEING relevant, who then asks some specific ones (This sounds bad);
(iii) ask all the BEINGS pseudo-simultaneously, and examine the
responders (this sounds too costly). The organization of BEINGs into
hierarchical groupings reflects the spirit of (ii). A BEING only
contains additions and exceptions to what its generalization contains,
so (i) is actually the dominant scheme now envisioned.
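Scheme (i), the dominant one, can be sketched as follows (Python, purely illustrative; the sample hierarchy and part contents are invented): a BEING stores only its additions and exceptions, and a missing part is fetched by climbing the generalization chain.

```python
# Scheme (i): ask a specific BEING; it queries its generalization when a part
# is absent. The hierarchy and contents below are invented for illustration.

beings = {
    "ACTIVITY":  {"genl": None,        "parts": {"WORTH": 100, "ALGORITHMS": "generic"}},
    "OPERATION": {"genl": "ACTIVITY",  "parts": {"DOM/RANGE": "anything -> anything"}},
    "COMPOSE":   {"genl": "OPERATION", "parts": {"WORTH": 300}},  # an exception
}

def ask(being, part):
    """Return the contents of the given part, inherited if necessary."""
    while being is not None:
        b = beings[being]
        if part in b["parts"]:
            return b["parts"][part]
        being = b["genl"]
    return None

# COMPOSE answers WORTH itself (300, overriding ACTIVITY's 100), but climbs
# two levels to answer ALGORITHMS, and one level to answer DOM/RANGE
```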

.SELECT 1

There are two kinds of discovery, evolution of abilities, which are specifically
disallowed: (i) the modification of the strategies and
control mechanism themselves$$
Since the precise strategies are so crucial, it might be
advantageous to allow them to evolve. This includes changing the system's
notion of interestingness as its experience grows.
This  was decided against because our thesis is that the same strategies
useful for dealing with premathematical concepts should suffice no matter how
far the system progresses.   One mechanism to effect this kind of development
is the standard sort of feedback paradigm, which in AM's case seems
one full step too ambitious.
Intuitions and strategies may be inspected and altered, just as any other
facets of BEINGs, but
the notions of how to judge interestingness, belief, safety, difficulty, etc. 
(plus all the algorithms for split-second estimating and updating these)
are fixed for
AM by its creator. If they are unsatisfactory, he must retune them.
*, and (ii) the introduction of brand new ⊗4kinds⊗* of
questions, of new parts of a type not foreseen at time of system creation.$$
The postulating of new kinds of parts is a very tricky process, and like the
first exclusion, I feel that the kinds of things one wants to know about
a concept shouldn't vary too much. A mechanism for introducing new kinds of
BEING parts is the following:
whenever a very interesting property P is discovered, with interesting variations,
which applies to many BEINGs, then assert that P is a new
part (slot, question), and that every BEING which has some variation of 
this property should fill
in a slot labelled P, with a description of its variation. 
The common link between all the
BEINGs having  property P should be found, and any new BEING possessing this
characteristic should eventually try to see what variation of P it 
possesses. That is, it should try to fill in a slot labelled P. *
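The footnote's mechanism for introducing a new kind of part, were it ever allowed, might look like this in outline (Python; the property, threshold, and BEING contents are invented): when enough BEINGs possess some variation of an interesting property, promote it to a slot which each such BEING fills in with its own variation.

```python
# Sketch of promoting an interesting, widely-shared property to a new slot.
# The threshold, property, and sample BEINGs are all hypothetical.

def promote_to_part(beings, prop_name, variation_of, threshold=3):
    """If at least `threshold` BEINGs possess some variation of the property,
    give each such BEING a new slot holding its variation."""
    holders = {name: variation_of(b) for name, b in beings.items()
               if variation_of(b) is not None}
    if len(holders) < threshold:
        return False
    for name, variation in holders.items():
        beings[name][prop_name] = variation
    return True

beings = {"UNION":     {"hint": "commutative"},
          "INTERSECT": {"hint": "commutative"},
          "COMPOSE":   {"hint": "non-commutative"},
          "SETS":      {}}
promoted = promote_to_part(beings, "COMMUTATIVITY", lambda b: b.get("hint"))
# three BEINGs share a variation, so COMMUTATIVITY becomes a slot; SETS,
# lacking any variation, never fills it in
```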

.NSECP(Examples of Individual Modules)

Perhaps it would be beneficial to glance over the file GIVEN at this point.
Each page there contains information relevant to one concept, broken down by facets.
Since that's a couple hundred pages long, we'll just glance at two
typical BEINGs: EXAMPLES and COMPOSITION.
All 150 BEINGs in GIVEN will eventually be coded by hand and fed to AM.
Since the BEINGs are all considered equal, there is in fact no atypically
interesting one which can be exhibited. So the two below are just mediocre;
they aren't worth poring over.

.SSEC(The "EXAMPLE" module)

<Look at Examples BEING, from the Given file>
This is the BEING named ANY-BEING.EXAMPLES.
Notice the general hints listed under FILLIN, for filling in the examples
of any given concept. β is an abbreviation for BEING. CHECK ex